Random Forest for Label Ranking

Authors

  • Yangming Zhou
  • Guoping Qiu
Abstract

Label ranking aims to learn a mapping from instances to rankings over a finite set of predefined labels. Random forest is one of the most powerful and successful general-purpose machine learning algorithms of modern times. To our knowledge, no prior work in the literature has applied random forests to label ranking. In this paper, we present a random forest label ranking method that uses random decision trees to retrieve nearest neighbors that are similar not only in the feature space but also in the ranking space. We have developed a novel two-step rank aggregation strategy that aggregates the neighboring rankings discovered by the random forest into a final predicted ranking. Compared with existing methods, the new random forest method has several advantages, including an intrinsically scalable tree data structure, a highly parallelizable computational architecture, and superior predictive performance. We present extensive experimental results showing that the new method achieves the best predictive accuracy among state-of-the-art methods, both on datasets with complete rankings and on datasets with only partial ranking information.

Keywords: label ranking, random forest, decision tree, nearest neighbour
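The abstract does not spell out the two-step aggregation strategy, so the following is only a minimal sketch of the general idea of merging neighboring rankings into one predicted ranking. It substitutes the standard Borda count for the authors' method; the function name `borda_aggregate` and the example labels are hypothetical.

```python
from collections import defaultdict


def borda_aggregate(rankings, labels):
    """Merge several rankings into one using the Borda count.

    NOTE: Borda count is a stand-in chosen for illustration; it is not
    the two-step strategy described in the paper.

    rankings: list of rankings, each a list of labels, most preferred first.
              A ranking may be partial (cover only some of the labels).
    labels:   the full list of labels; its order is used to break ties.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        n = len(ranking)
        for position, label in enumerate(ranking):
            # The top label of a ranking of length n earns n points,
            # the next one n - 1, and so on down to 1.
            scores[label] += n - position
    # Higher total score ranks first; ties fall back to the order in `labels`.
    return sorted(labels, key=lambda lab: (-scores[lab], labels.index(lab)))


if __name__ == "__main__":
    neighbours = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"]]
    print(borda_aggregate(neighbours, ["a", "b", "c"]))
```

In this sketch a partial ranking contributes points only to the labels it contains, which is one simple way to handle incomplete ranking information; other conventions are possible.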

Related resources

Feature ranking for multi-label classification using predictive clustering trees

In this work, we present a feature ranking method for multilabel data. The method is motivated by practically relevant multilabel applications, such as semantic annotation of images and videos, functional genomics, and music and text categorization. We propose a feature ranking method based on random forests. Considering the success of the feature ranking using random forest in the task...

Fast Unsupervised Automobile Insurance Fraud Detection Based on Spectral Ranking of Anomalies

Collecting insurance fraud samples is costly and, if performed manually, very time consuming. This suggests using unsupervised models. One accurate method in this regard is Spectral Ranking of Anomalies (SRA), which has been shown to work better than other methods specifically for auto insurance fraud detection. However, this approach is not scalable to large samples and is not appro...

Learning from multi-label data with interactivity constraints: An extensive experimental study

Interactive classification aims at introducing user preferences into the learning process to produce individualized outcomes better adapted to each user’s behaviour than fully automatic approaches. Current interactive classification systems generally adopt a single-label classification paradigm that constrains items to span one label at a time and consequently limit the user’s expressiveness...

Ranking forests

The present paper examines how the aggregation and feature randomization principles underlying the algorithm Random Forest (Breiman (2001)) can be adapted to bipartite ranking. The approach taken here is based on nonparametric scoring and ROC curve optimization in the sense of the AUC criterion. In this problem, aggregation is used to increase the performance of scoring rules produced by rankin...

Robust Ranking Models using Noisy Feedback

Direct feedback from search engine users in the form of click information is naturally noisy. Ranking models that integrate such feedback in their training process must cope with this noise. In the worst case, such noise can lead to large variance among the results for different queries in the resulting rankings. We propose to integrate model averaging like bagging and random forest methods to reduce the vari...

Journal:
  • CoRR

Volume: abs/1608.07710

Publication date: 2016